aei_no_std <- read.csv("/Volumes/RachelExternal/Thesis/Data_upload_for_CL/AEI_NoStd.csv")
We’ve got a lot of things here to log, scale or center.
What do I mean by scaling or centering?
Scaling: Rescaling the predictor to the range [0,1] (for non-negative values) using the formula: \(\frac{y_i}{max(y)}\)
Centering: Shifting the predictor to a mean of 0 and rescaling it to a standard deviation of 1 using the formula: \(\frac{y_i-mean(y)}{sd(y)}\)
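As a quick check on the two formulas above, here is a minimal sketch in base R. The vector `y` is made-up toy data, not a column from `aei_no_std`, and the variable names are only illustrative.

```r
# Toy predictor; these values are invented for illustration only.
y <- c(2, 4, 6, 8, 10)

# Scaling: divide by the maximum so every value falls in [0, 1]
# (this only holds when y is non-negative).
y_scaled <- y / max(y)

# Centering: subtract the mean and divide by the standard deviation,
# which gives a mean of 0 and a standard deviation of 1.
y_centered <- (y - mean(y)) / sd(y)

# Base R's scale() applies the same centering formula by default;
# it returns a one-column matrix, so convert back to a plain vector.
y_centered_alt <- as.numeric(scale(y))

range(y_scaled)
mean(y_centered)
sd(y_centered)
```

Note that scaling by the maximum keeps zeros at zero, while centering moves them, which can matter for predictors where zero is meaningful.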
Why must I do this?
Well, a couple of reasons. First and foremost, it makes specifying priors easier, because the distribution of the predictor is already centered on its mean and, for roughly symmetric data, most of the data points fall within one standard deviation on either side. Another reason is that it makes the interpretation of the coefficients a bit easier, as you can clearly tell which have positive or negative effects.
Let's plot what these look like without being transformed. These graphs are also interactive, so feel free to click around.
Some things of note here: I will be using income instead of total GDP, median Humidity and PET instead of the average, and Humidity with the Inf/NaN values replaced.
A lot of things here are right-skewed (most of the mass sits at low values, with a long tail to the right), which is to be expected as the majority of countries are smaller rather than bigger in all aspects. There is little noticeable difference between the regions in many of the predictor variables. Income has trends as you would expect, with Europe having higher incomes and Sub-Saharan Africa having lower, with some of the high outliers being from North Africa and the Middle East. There are bigger regional differences in Humidity and PET (remember these two are related: \(Humidity = Precip/PET\)). Precip looks similar to Humidity (again, similarities were expected). For Ruggedness, regional trends are tough to find visually.
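Because Humidity is a ratio, a PET of zero is one way Inf or NaN values can appear. The sketch below shows that with hypothetical numbers and one possible replacement (setting non-finite values to `NA`). This is an assumption for illustration; the replacement rule actually used for the data is not stated here.

```r
# Hypothetical precipitation and PET values, chosen to trigger the edge cases.
precip <- c(500, 0, 120, 0)
pet    <- c(1000, 800, 0, 0)

# Humidity = Precip / PET; dividing by zero yields Inf, and 0 / 0 yields NaN.
humidity <- precip / pet

# One possible replacement (an assumption, not necessarily what was done
# for the thesis data): mark every non-finite value as missing.
humidity_clean <- humidity
humidity_clean[!is.finite(humidity_clean)] <- NA

humidity
humidity_clean
```

Marking these values as `NA` keeps them out of summaries such as `median(humidity_clean, na.rm = TRUE)` without inventing a number for them.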
What about crop fractions?